Subspace Gaussian Mixture Models for Large Vocabulary Speech Recognition
Abstract
The subspace Gaussian mixture model (GMM) is an alternative approach to approximating the probability density function (p.d.f.) of a set of independent and identically distributed (i.i.d.) data using prior density estimates. In this approach, the prior density of the GMM parameters is estimated from a development dataset, and when modelling newly enrolled data, this prior knowledge can be exploited through criteria such as maximum a posteriori (MAP). Unlike conventional prior estimation methods for GMMs, this approach takes into account the correlations between the parameters of different Gaussian components. To handle the large parameter set while keeping the priors informative, the prior density estimation is constrained to a low-dimensional subspace of the full model space that captures the main model variations. The subspace GMM has already been applied successfully to speaker recognition with promising performance, but little work has applied it to speech recognition. In this paper, we present a new framework for an HMM-based speech recognition system built on the subspace GMM, in which the parameters of the state-dependent GMMs are not estimated separately but are generated from a globally shared low-dimensional model subspace. The approach can considerably reduce the model size and, in addition, make the speech recognition system more scalable and adaptable. We first review the principles of the subspace GMM approach based on its applications in speaker recognition and then discuss how to extend it to speech recognition.
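The idea of generating state-dependent GMM parameters from a shared subspace can be illustrated with a minimal sketch. The abstract does not give the paper's exact parameterisation, so the linear form used here (each component mean is a shared projection matrix applied to a low-dimensional state vector, mu_i = M_i v) is an assumption, as are all function names and the example sizes. Only the means are modelled; weights and covariances are left out.

```python
import random


def make_subspace(num_components, feat_dim, subspace_dim, seed=0):
    """Build hypothetical shared projection matrices M_i.

    Returns a list of num_components matrices, each feat_dim x subspace_dim.
    Random values stand in for parameters that would be trained in practice.
    """
    rng = random.Random(seed)
    return [[[rng.gauss(0.0, 1.0) for _ in range(subspace_dim)]
             for _ in range(feat_dim)]
            for _ in range(num_components)]


def state_means(shared_M, state_vector):
    """Generate every component mean of one state as mu_i = M_i v (assumed form)."""
    return [[sum(m * v for m, v in zip(row, state_vector)) for row in M_i]
            for M_i in shared_M]


def mean_param_counts(num_states, num_components, feat_dim, subspace_dim):
    """Count mean parameters for separately estimated GMMs vs. the subspace form."""
    conventional = num_states * num_components * feat_dim
    subspace = (num_components * feat_dim * subspace_dim  # shared matrices M_i
                + num_states * subspace_dim)              # one vector per state
    return conventional, subspace


if __name__ == "__main__":
    M = make_subspace(num_components=4, feat_dim=3, subspace_dim=2)
    print(state_means(M, [0.5, -1.0]))
    # Illustrative sizes only: 1000 states, 16 Gaussians, 39-dim features, 20-dim subspace.
    print(mean_param_counts(1000, 16, 39, 20))  # (624000, 32480)
```

With these illustrative sizes, the per-state cost drops from 16 × 39 mean values to a single 20-dimensional vector, which is the kind of model-size reduction the abstract refers to.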
Similar resources
Microsoft Word - Hybridmodel2.dot
Today’s state-of-the-art speech recognition systems typically use continuous density hidden Markov models with mixture of Gaussian distributions. Such speech recognition systems have problems; they require too much memory to run, and are too slow for large vocabulary applications. Two approaches are proposed for the design of compact acoustic models, namely, subspace distribution clustering hid...
Application of Subspace Gaussian Mixture Models in Contrastive Acoustic Scenarios
This paper describes experimental results of applying Subspace Gaussian Mixture Models (SGMMs) in two completely diverse acoustic scenarios: (a) for Large Vocabulary Continuous Speech Recognition (LVCSR) task over (well-resourced) English meeting data and, (b) for acoustic modeling of underresourced Afrikaans telephone data. In both cases, the performance of SGMM models is compared with a conve...
Combating reverberation in large vocabulary continuous speech recognition
Reverberation leads to high word error rates (WERs) for automatic speech recognition (ASR) systems. This work presents robust acoustic features motivated by subspace modeling and human speech perception for use in large vocabulary continuous speech recognition (LVCSR). We explore different acoustic modeling strategies and language modeling techniques, and demonstrate that robust features with a...
Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting
Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...
Large vocabulary conversational speech recognition with a subspace constraint on inverse covariance matrices
This paper applies the recently proposed SPAM models for acoustic modeling in a Speaker Adaptive Training (SAT) context on large vocabulary conversational speech databases, including the Switchboard database. SPAM models are Gaussian mixture models in which a subspace constraint is placed on the precision and mean matrices (although this paper focuses on the case of unconstrained means). They i...